What is claimed is:



1. A computer-implemented method for randomly walking through a hypertext-linked document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, the method comprising:
    a) selecting a host;
    b) selecting at random a document associated with the host;
    c) retrieving the selected document;
    d) selecting at random a link in the retrieved document;
    e) retrieving a document referenced by the selected link; and
    f) repeating d) and e) until a predetermined condition is met.
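For illustration only, and not as part of the claims, steps a) through f) of claim 1 can be sketched in Python. The toy link graph `WEB`, the lookup table `LINKS` (standing in for retrieval), the function name `random_walk`, and the step cap `max_steps` (standing in for the predetermined condition) are all assumptions introduced for this sketch.

```python
import random

# Hypothetical toy "web": host -> {document: [linked documents]}.
WEB = {
    "a.example": {"a/1": ["a/2", "b/1"], "a/2": ["a/1"]},
    "b.example": {"b/1": ["a/1", "b/2"], "b/2": ["b/1"]},
}
# Flattened document -> links lookup; reading it stands in for "retrieving".
LINKS = {doc: links for docs in WEB.values() for doc, links in docs.items()}


def random_walk(start_host, max_steps, rng):
    """Sketch of steps a)-f): pick a document on the host, then follow links."""
    doc = rng.choice(sorted(WEB[start_host]))  # b) random document on the host
    visited = [doc]                            # c) "retrieve" it
    for _ in range(max_steps):                 # f) assumed stop condition
        links = LINKS[doc]
        if not links:                          # dead end: nothing to follow
            break
        doc = rng.choice(links)                # d) random link
        visited.append(doc)                    # e) retrieve the linked document
    return visited


walk = random_walk("a.example", max_steps=5, rng=random.Random(0))
```

Passing the random generator in as a parameter keeps the walk reproducible, which is convenient when comparing runs.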

2. The method of claim 1, further comprising, prior to d):
    c.1) responsive to a random event:
        c.1.1) selecting at random a host from among the previously selected hosts; and
        c.1.2) repeating b) through f);
and wherein f) comprises repeating c.1) through e) until a predetermined condition is met.

3. The method of claim 1, further comprising, prior to d):
    c.1) generating a random number;
    c.2) determining whether the random number falls within a predetermined range; and
    c.3) responsive to the random number falling within the predetermined range:
        c.1.1) selecting at random a host from among the previously selected hosts; and
        c.1.2) repeating b) through f).
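As a hedged illustration of the random-number test in claim 3 (not part of the claims), the sketch below draws a uniform number and treats the half-open interval [0.0, 0.15) as the predetermined range. The bounds, the function names `should_jump` and `next_host`, and the sample host names are assumptions; the claim does not fix any of them.

```python
import random


def should_jump(rng, low=0.0, high=0.15):
    """c.1)-c.2): draw a random number and test it against an assumed range."""
    r = rng.random()  # uniform in [0.0, 1.0)
    return low <= r < high


def next_host(current_host, visited_hosts, rng):
    """c.3): on a hit, restart from a randomly chosen previously selected host."""
    if should_jump(rng):
        return rng.choice(visited_hosts)
    return current_host


rng = random.Random(42)
hosts_seen = ["a.example", "b.example", "c.example"]
choices = [next_host("a.example", hosts_seen, rng) for _ in range(1000)]
jump_rate = sum(should_jump(rng) for _ in range(10_000)) / 10_000
```

With these bounds, roughly 15% of the steps trigger a jump; widening the range makes the walk restart more often.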



4. The method of claim 1, wherein the document set is the World Wide Web, and wherein each document is a web page.

5. The method of claim 4, wherein each host corresponds to a domain.

6. The method of claim 1, further comprising, concurrently with a) through f), performing a second two-level random walk through the hypertext-linked document set.

7. A computer-implemented method for randomly walking through a hypertext-linked document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, the method comprising:
    a) initializing a host set;
    b) initializing a document set for each host in the host set;
    c) selecting at random a host from the host set;
    d) selecting at random a document from the document set of the selected host;
    e) adding the selected host to the host set;
    f) adding the selected document to the document set of the selected host;
    g) responsive to the selected document containing at least one link:
        g.1) selecting at random a link from the selected document;
        g.2) selecting a document corresponding to the selected link;
        g.3) selecting a host corresponding to the selected document;
        g.4) repeating e) through h) until a predetermined condition is met; and
    h) responsive to the selected document not containing at least one link, repeating c) through h) until a predetermined condition is met.
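The host-set and document-set bookkeeping of claim 7 can be sketched as follows, purely as an illustration outside the claims. The link graph `LINKS`, the helper `host_of` (which treats the URL network location as the host), the seed URL, and the fixed step count used as the predetermined condition are assumptions made for this sketch.

```python
import random
from urllib.parse import urlsplit

# Hypothetical link graph keyed by URL; values are outgoing links.
LINKS = {
    "http://a.example/1": ["http://b.example/1", "http://a.example/2"],
    "http://a.example/2": [],
    "http://b.example/1": ["http://c.example/1"],
    "http://c.example/1": ["http://a.example/1"],
}


def host_of(url):
    """Assumed host mapping: the network location of the URL."""
    return urlsplit(url).netloc


def two_level_walk(seed_urls, max_steps, rng):
    docs_by_host = {}  # a), b): host set with one document set per host
    for url in seed_urls:
        docs_by_host.setdefault(host_of(url), set()).add(url)
    visited = []
    host = rng.choice(sorted(docs_by_host))        # c)
    doc = rng.choice(sorted(docs_by_host[host]))   # d)
    for _ in range(max_steps):                     # assumed stop condition
        docs_by_host.setdefault(host, set()).add(doc)  # e), f)
        visited.append(doc)
        links = LINKS.get(doc, [])
        if links:                                  # g)
            doc = rng.choice(links)                # g.1), g.2)
            host = host_of(doc)                    # g.3)
        else:                                      # h) dead end: back to c)
            host = rng.choice(sorted(docs_by_host))
            doc = rng.choice(sorted(docs_by_host[host]))
    return visited, docs_by_host


visited, docs_by_host = two_level_walk(
    ["http://a.example/1"], 20, random.Random(1)
)
```

Using sets for the per-host document collections means a repeated visit is not stored twice, which corresponds to the conditional additions recited in claim 8.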

8. The method of claim 7, wherein:
    e) is performed responsive to the selected host not being in the host set; and
    f) is performed responsive to the selected document not being in the document set of the selected host.

9. The method of claim 7, wherein g) further comprises, prior to g.1):
    g.0) responsive to a random event, repeating c) through h) until a predetermined condition is met;
and wherein g.1) through g.4) are performed responsive to non-occurrence of the random event of g.0).

Case 3792

10. The method of claim 7, further comprising, prior to g.1):
    g.0.1) generating a random number;
    g.0.2) determining whether the random number falls within a predetermined range; and
    g.0.3) responsive to the random number falling within the predetermined range, repeating c) through h) until a predetermined condition is met;
and wherein g.1) through g.4) are performed responsive to the random number not falling within a predetermined range.

11. The method of claim 7, wherein the hypertext-linked document set is the World Wide Web, and wherein each document is a web page.

12. The method of claim 11, wherein each host corresponds to a domain.

13. A computer-implemented method for measuring relative quality of a search engine index, comprising:
    a) performing a two-level random walk among documents within a document set;
    b) for each document encountered in the random walk, determining whether the document is indexed by the search engine index; and
    c) aggregating the results of b).
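One simple way to read steps b) and c) of claim 13 is as a hit fraction over the walked documents. The sketch below is an illustration under that assumption, not the claimed aggregation itself; the function `index_coverage`, the sample walk, and the two index contents are hypothetical.

```python
def index_coverage(walked_docs, is_indexed):
    """b), c): test each walked document, then aggregate into a hit fraction."""
    if not walked_docs:
        return 0.0
    hits = sum(1 for doc in walked_docs if is_indexed(doc))
    return hits / len(walked_docs)


# Hypothetical walk output and two hypothetical search engine indexes.
walk = ["a/1", "b/1", "a/1", "c/1", "b/2"]
index_x = {"a/1", "b/1"}
index_y = {"a/1", "b/1", "b/2", "c/1"}

coverage_x = index_coverage(walk, index_x.__contains__)
coverage_y = index_coverage(walk, index_y.__contains__)
```

Comparing `coverage_x` with `coverage_y` gives a relative measure: the index that contains more of the walked documents scores higher.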



14. The method of claim 13, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, and wherein a) comprises:
    a.1) selecting a host;
    a.2) selecting at random a document associated with the host;
    a.3) retrieving the selected document;
    a.4) selecting at random a link in the retrieved document;
    a.5) retrieving a document referenced by the selected link; and
    a.6) repeating a.4) and a.5) until a predetermined condition is met.



15. The method of claim 14, further comprising, prior to a.4):
    a.3.1) responsive to a random event:
        a.3.1.1) selecting at random a host from among the previously selected hosts; and
        a.3.1.2) repeating a.2) through a.6).



16. The method of claim 13, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, and wherein a) comprises:
    a.1) initializing a host set;
    a.2) initializing a document set for each host in the host set;
    a.3) selecting at random a host from the host set;
    a.4) selecting at random a document from the document set of the selected host;
    a.5) adding the selected host to the host set;
    a.6) adding the selected document to the document set of the selected host;
    a.7) responsive to the selected document containing at least one link:
        a.7.1) selecting at random a link from the selected document;
        a.7.2) selecting a document corresponding to the selected link;
        a.7.3) selecting a host corresponding to the selected document;
        a.7.4) repeating a.5) through a.8) until a predetermined condition is met; and
    a.8) responsive to the selected document not containing at least one link, repeating a.3) through a.8) until a predetermined condition is met.

17. The method of claim 16, wherein:
    a.5) is performed responsive to the selected host not being in the host set; and
    a.6) is performed responsive to the selected document not being in the document set of the selected host.

18. The method of claim 13, wherein each document contains a plurality of words, and wherein b) comprises, for each document encountered in the random walk:
    b.1) selecting at least one word from the document;
    b.2) performing a query on the search engine index based on the selected at least one word, to obtain search results; and
    b.3) determining whether the document is included in the obtained search results.

19. The method of claim 18, wherein b.1) comprises selecting at least one word based on rarity.
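The query-based test of claims 18 and 19 can be illustrated with a toy corpus, outside the claims. Here rarity is approximated by document frequency within the same small corpus, and the search engine is a plain conjunctive match over a dictionary; both choices, along with the names `CORPUS`, `rarest_words`, `search`, and `is_indexed`, are assumptions for this sketch.

```python
from collections import Counter

# Hypothetical corpus standing in for the walked pages and for the index.
CORPUS = {
    "a/1": "random walk over the web graph",
    "b/1": "the web is a large graph",
    "c/1": "quasar spectra and the web",
}


def rarest_words(doc_id, n=2):
    """b.1): pick the n words of the document found in the fewest documents."""
    doc_freq = Counter(w for text in CORPUS.values() for w in set(text.split()))
    words = sorted(set(CORPUS[doc_id].split()), key=lambda w: (doc_freq[w], w))
    return words[:n]


def search(index, words):
    """b.2): toy conjunctive query; returns ids of documents with all words."""
    return [d for d, text in index.items()
            if all(w in text.split() for w in words)]


def is_indexed(doc_id, index):
    """b.3): the document counts as indexed if the query returns it."""
    return doc_id in search(index, rarest_words(doc_id))


# An index that is missing document "c/1".
partial_index = {k: v for k, v in CORPUS.items() if k != "c/1"}
```

Rare words keep the result list short, so checking whether the document appears in the results stays cheap.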

20. A computer-implemented method for measuring relative quality of a document in a document set, comprising:
    a) performing a two-level random walk among documents within a document set; and
    b) determining a quality metric responsive to the number of times the document is encountered in the random walk.
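A minimal sketch of step b) of claim 20, offered as an illustration only: the quality metric is taken to be the share of walk steps that landed on each document. Normalizing by the walk length is an assumption; the claim only requires the metric to be responsive to the encounter count. The function name `visit_quality` and the sample walk are hypothetical.

```python
from collections import Counter


def visit_quality(walked_docs):
    """Assumed metric: share of walk steps that landed on each document."""
    if not walked_docs:
        return {}
    counts = Counter(walked_docs)
    total = len(walked_docs)
    return {doc: n / total for doc, n in counts.items()}


walk = ["a/1", "b/1", "a/1", "c/1", "a/1", "b/1"]  # hypothetical walk output
quality = visit_quality(walk)
```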



21. A computer-implemented method for measuring relative quality of a document in a document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, the method comprising:
    a) performing a two-level random walk among documents within a document set; and
    b) determining a quality metric responsive to the number of documents that link to the document.

22. The method of claim 21, wherein b) comprises determining a quality metric responsive to the number of documents that link to the document, and responsive to the quality metric of the linking documents.

23. The method of claim 21, wherein b) comprises determining a value for:

    $$R(p) = d/T + (1 - d) \sum_{i=1}^{k} R(p_i) / C(p_i)$$

where:
    T is the total number of documents in the document set;
    d is a damping factor such that 0 < d < 1;
    documents p_1, ..., p_k each contain at least one link to document p; and
    C(p) is the number of links out of p.
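For illustration only, the value R(p) recited in claim 23 can be approximated by fixed-point iteration over a small link graph. The iteration scheme, the uniform starting values, the choice d = 0.15, the iteration count, and the three-document graph `LINKS` are assumptions of this sketch; the claim specifies only the formula. Note that, as written in the claim, d multiplies the uniform term and (1 - d) multiplies the sum over linking documents.

```python
def rank(links, d=0.15, iterations=100):
    """Iterate R(p) = d/T + (1 - d) * sum(R(p_i) / C(p_i)) over in-links p_i."""
    T = len(links)
    R = {p: 1.0 / T for p in links}  # assumed uniform starting values
    for _ in range(iterations):
        new_R = {}
        for p in links:
            # Sum over documents q that link to p, each weighted by 1 / C(q).
            inbound = sum(R[q] / len(out) for q, out in links.items() if p in out)
            new_R[p] = d / T + (1 - d) * inbound
        R = new_R
    return R


# Hypothetical three-document set; every document has at least one out-link.
LINKS = {"p1": ["p2", "p3"], "p2": ["p3"], "p3": ["p1"]}
R = rank(LINKS)
```

Because every document in this toy graph has an out-link, the values keep summing to 1 across iterations; documents without out-links would need separate handling.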



24. The method of claim 21, wherein each document is associated with a host, and wherein a) comprises:
    a.1) selecting a host;
    a.2) selecting at random a document associated with the host;
    a.3) retrieving the selected document;
    a.4) responsive to a random event:
        a.4.1) selecting at random a host from among the previously selected hosts; and
        a.4.2) repeating a.2) through a.7);
    a.5) selecting at random a link in the retrieved document;
    a.6) retrieving a document referenced by the selected link; and
    a.7) repeating a.4) to a.6) until a predetermined condition is met.

25. The method of claim 21, wherein each document is associated with a host, and wherein a) comprises:
    a.1) initializing a host set;
    a.2) initializing a document set for each host in the host set;
    a.3) selecting at random a host from the host set;
    a.4) responsive to a random event:
        a.4.1) selecting at random a host from among the previously selected hosts; and
        a.4.2) repeating a.2) through a.7);
    a.5) selecting at random a document from the document set of the selected host;
    a.6) adding the selected host to the host set;
    a.7) adding the selected document to the document set of the selected host;
    a.8) responsive to the selected document containing at least one link:
        a.8.1) selecting at random a link from the selected document;
        a.8.2) selecting a document corresponding to the selected link;
        a.8.3) selecting a host corresponding to the selected document; and
        a.8.4) repeating a.6) through a.9) until a predetermined condition is met; and
    a.9) responsive to the selected document not containing at least one link, repeating a.3) through a.9) until a predetermined condition is met.



26. The method of claim 21, further comprising:
    c) determining a quality metric for at least one additional document; and
    d) ranking the quality metric of the first document with respect to the quality metrics of the additional documents.
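The ranking in step d) of claim 26 amounts to ordering documents by their quality metrics. The sketch below is illustrative only; the function `rank_documents`, the tie-break by document identifier, and the sample metric values are assumptions.

```python
def rank_documents(quality):
    """d): order documents by quality metric, best first; ties broken by id."""
    ordered = sorted(quality, key=lambda doc: (-quality[doc], doc))
    return {doc: position for position, doc in enumerate(ordered, start=1)}


quality = {"a/1": 0.50, "b/1": 0.33, "c/1": 0.17}  # hypothetical metrics
positions = rank_documents(quality)
```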

27. A computer-implemented method for randomly walking through a hypertext-linked document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, the method comprising:
    a) selecting a host;
    b) selecting at random a document associated with the host;
    c) retrieving the selected document;
    d) responsive to a random event:
        d.1) selecting at random a host from among the previously selected hosts; and
        d.2) repeating b) through e) until a predetermined condition is met; and
    e) responsive to the random event not occurring:
        e.1) selecting at random a link in the retrieved document;
        e.2) retrieving a document referenced by the selected link; and
        e.3) repeating d) and e) until a predetermined condition is met.
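The alternation between jumping (step d) and link following (step e) in claim 27 can be sketched as below, as an illustration outside the claims. The tables `LINKS` and `DOCS_BY_HOST`, the jump probability used as the random event, the fixed step count used as the predetermined condition, and the rule that a host counts as previously selected once the walk has reached it are all assumptions of this sketch.

```python
import random

LINKS = {  # hypothetical: document -> outgoing links
    "a.example/1": ["b.example/1"],
    "b.example/1": ["c.example/1", "a.example/1"],
    "c.example/1": ["a.example/1"],
}
DOCS_BY_HOST = {  # hypothetical: host -> documents on that host
    "a.example": ["a.example/1"],
    "b.example": ["b.example/1"],
    "c.example": ["c.example/1"],
}


def walk_with_jumps(start_host, steps, jump_prob, rng):
    hosts_seen = [start_host]                     # a)
    doc = rng.choice(DOCS_BY_HOST[start_host])    # b), c)
    path = [doc]
    for _ in range(steps):                        # assumed stop condition
        if rng.random() < jump_prob:              # d) random event occurs
            host = rng.choice(hosts_seen)         # d.1)
            doc = rng.choice(DOCS_BY_HOST[host])  # d.2) back to b)
        else:                                     # e) event did not occur
            doc = rng.choice(LINKS[doc])          # e.1), e.2)
            host = doc.split("/", 1)[0]
            if host not in hosts_seen:
                hosts_seen.append(host)
        path.append(doc)
    return path, hosts_seen


path, hosts_seen = walk_with_jumps("a.example", 30, 0.2, random.Random(7))
```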



28. A computer-implemented method for measuring relative quality of a document in a document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, the method comprising:
    a) performing a two-level random walk among documents within a document set, the two-level random walk comprising:
        a.1) initializing a host set;
        a.2) initializing a document set for each host in the host set;
        a.3) selecting at random a host from the host set;
        a.4) responsive to a random event:
            a.4.1) selecting at random a host from among the previously selected hosts; and
            a.4.2) repeating a.2) through a.7);
        a.5) selecting at random a document from the document set of the selected host;
        a.6) adding the selected host to the host set;
        a.7) adding the selected document to the document set of the selected host;
        a.8) responsive to the selected document containing at least one link:
            a.8.1) selecting at random a link from the selected document;
            a.8.2) selecting a document corresponding to the selected link;
            a.8.3) selecting a host corresponding to the selected document;
            a.8.4) repeating a.6) through a.9) until a predetermined condition is met; and
        a.9) responsive to the selected document not containing at least one link, repeating a.3) through a.9) until a predetermined condition is met;
    b) determining a quality metric responsive to the number of documents that link to the document;
    c) determining a quality metric for at least one additional document; and
    d) ranking the quality metric of the first document with respect to the quality metrics of the additional documents.

29. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for randomly walking through a hypertext-linked document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, the computer program product comprising:
    a) computer-readable program code devices configured to cause a computer to select a host;
    b) computer-readable program code devices configured to cause a computer to select at random a document associated with the host;
    c) computer-readable program code devices configured to cause a computer to retrieve the selected document;
    d) computer-readable program code devices configured to cause a computer to select at random a link in the retrieved document;
    e) computer-readable program code devices configured to cause a computer to retrieve a document referenced by the selected link; and
    f) computer-readable program code devices configured to cause a computer to repeat the operations of d) and e) until a predetermined condition is met.

30. The computer program product of claim 29, further comprising computer-readable program code devices configured to cause a computer to, prior to selecting at random a link in the retrieved document:
    c.1) responsive to a random event:
        select at random a host from among the previously selected hosts; and
        repeat the operations of b) through f);
and wherein the computer-readable program code devices configured to cause a computer to repeat the operations of d) and e) until a predetermined condition is met comprise computer-readable program code devices configured to cause a computer to repeat the operations of c.1) through e) until a predetermined condition is met.

31. The computer program product of claim 29, further comprising:
    computer-readable program code devices configured to cause a computer to generate a random number;
    computer-readable program code devices configured to cause a computer to determine whether the random number falls within a predetermined range; and
    computer-readable program code devices configured to cause a computer to, responsive to the random number falling within the predetermined range:
        select at random a host from among the previously selected hosts; and
        repeat the operations of d) through f).

32. The computer program product of claim 29, wherein the document set is the World Wide Web, and wherein each document is a web page.

33. The computer program product of claim 32, wherein each host corresponds to a domain.

34. The computer program product of claim 29, further comprising computer-readable program code devices configured to cause a computer to, concurrently with the operations of a) through f), perform a second two-level random walk through the hypertext-linked document set.

35. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for randomly walking through a hypertext-linked document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, the computer program product comprising:
    a) computer-readable program code devices configured to cause a computer to initialize a host set;
    b) computer-readable program code devices configured to cause a computer to initialize a document set for each host in the host set;
    c) computer-readable program code devices configured to cause a computer to select at random a host from the host set;
    d) computer-readable program code devices configured to cause a computer to select at random a document from the document set of the selected host;
    e) computer-readable program code devices configured to cause a computer to add the selected host to the host set;
    f) computer-readable program code devices configured to cause a computer to add the selected document to the document set of the selected host;
    g) computer-readable program code devices configured to cause a computer to, responsive to the selected document containing at least one link:
        g.1) select at random a link from the selected document;
        g.2) select a document corresponding to the selected link;
        g.3) select a host corresponding to the selected document; and
        g.4) repeat the operations of e) through h) until a predetermined condition is met; and
    h) computer-readable program code devices configured to cause a computer to, responsive to the selected document not containing at least one link, repeat the operations of c) through

36. The computer program product of claim 35, wherein:
    the computer-readable program code devices configured to cause a computer to add the selected host to the host set operate responsive to the selected host not being in the host set; and
    the computer-readable program code devices configured to cause a computer to add the selected document to the document set of the selected host operate responsive to the selected document not being in the document set of the selected host.



37. The computer program product of claim 35, wherein computer-readable program code devices g) further comprise computer-readable program code devices configured to cause a computer to, prior to g.1):
    g.0) responsive to a random event, repeat the operations of c) through h) until a predetermined condition is met;
and wherein computer-readable program code devices g) are configured to cause a computer to perform g.1) through g.4) responsive to non-occurrence of the random event of g.0).

38. The computer program product of claim 35, wherein computer-readable program code devices g) further comprise computer-readable program code devices configured to cause a computer to, prior to g.1):
    g.0.1) generate a random number;
    g.0.2) determine whether the random number falls within a predetermined range; and
    g.0.3) responsive to the random number falling within the predetermined range, repeat the operations of c) through h) until
and wherein computer-readable program code devices g) are configured to cause a computer to perform g.1) through g.4) responsive to the random number not falling within a predetermined range.

39. The computer program product of claim 35, wherein the hypertext-linked document set is the World Wide Web, and wherein each document is a web page.

40. The computer program product of claim 39, wherein each host corresponds to a domain.

41. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for measuring relative quality of a search engine index, the computer program product comprising:
    a) computer-readable program code devices configured to cause a computer to perform a two-level random walk among documents within a document set;
    b) computer-readable program code devices configured to cause a computer to, for each document encountered in the random walk, determine whether the document is indexed by the search engine index; and
    c) computer-readable program code devices configured to cause a computer to aggregate the results of the operations of b).

42. The computer program product of claim 41, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, and wherein the computer-readable program code devices configured to cause a computer to perform a two-level random walk comprise:
    a.1) computer-readable program code devices configured to cause a computer to select a host;
    a.2) computer-readable program code devices configured to cause a computer to select at random a document associated with the host;
    a.3) computer-readable program code devices configured to cause a computer to retrieve the selected document;
    a.4) computer-readable program code devices configured to cause a computer to select at random a link in the retrieved document;
    a.5) computer-readable program code devices configured to cause a computer to retrieve a document referenced by the selected link; and
    a.6) computer-readable program code devices configured to cause a computer to repeat the operations of a.4) and a.5) until a predetermined condition is met.

43. The computer program product of claim 42, further comprising computer-readable program code devices configured to cause a computer to, prior to selecting at random a link in the retrieved document:
    a.3.1) responsive to a random event:
        select at random a host from among the previously selected hosts; and
        repeat the operations of a.2) through a.6).

44. The computer program product of claim 41, wherein at least a subset of the documents contain a plurality of links to other documents, each document being associated with a host, and wherein the computer-readable program code devices configured to cause a computer to perform a two-level random walk comprise:
    a.1) computer-readable program code devices configured to cause a computer to initialize a host set;
    a.2) computer-readable program code devices configured to cause a computer to initialize a document set for each host in the host set;
    a.3) computer-readable program code devices configured to cause a computer to select at random a host from the host set;
    a.4) computer-readable program code devices configured to cause a computer to select at random a document from the document set of the selected host;
    a.5) computer-readable program code devices configured to cause a computer to add the selected host to the host set;
    a.6) computer-readable program code devices configured to cause a computer to add the selected document to the document set of the selected host;
    a.7) computer-readable program code devices configured to cause a computer to, responsive to the selected document containing at least one link:
        a.7.1) select at random a link from the selected document;
        a.7.2) select a document corresponding to the selected link;
        a.7.3) select a host corresponding to the selected document;
        a.7.4) repeat the operations of a.5) through a.8) until a predetermined condition is met; and
    a.8) computer-readable program code devices configured to cause a computer to, responsive to the selected document not containing at least one link, repeat the operations of a.3) through a.8) until a predetermined condition is met.

45. The computer program product of claim 44, wherein:
    the computer-readable program code devices configured to cause a computer to add the selected host to the host set are configured to cause a computer to add the selected host responsive to the selected host not being in the host set; and
    the computer-readable program code devices configured to cause a computer to add the selected document to the document set of the selected host are configured to cause a computer to add the selected document responsive to the selected document not being in the document set of the selected host.

46. The computer program product of claim 41, wherein each document contains a plurality of words, and wherein the computer-readable program code devices configured to cause a computer to determine whether the document is indexed by the search engine index comprise computer-readable program code devices configured to, for each document encountered in the random walk:
    b.1) select at least one word from the document;
    b.2) perform a query on the search engine index based on the selected at least one word, to obtain search results; and
    b.3) determine whether the document is included in the obtained search results.

47. The computer program product of claim 46, wherein the computer-readable program code devices configured to select at least one word from the document comprise computer-readable program code devices con-

48. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for measuring relative quality of a document in a document set, the computer program product comprising:
    computer-readable program code devices configured to cause a computer to perform a two-level random walk among documents within a document set; and
    computer-readable program code devices configured to cause a computer to determine a quality metric responsive to the number of times the document is encountered in the random walk.

49. A computer program product comprising a computer-usable medium having computer-readable code embodied therein for measuring relative quality of a document in a document set comprising a plurality of documents, wherein at least a subset of the documents contain a plurality of links to other documents, the computer program product comprising:
    computer-readable program code devices configured to cause a computer to perform a two-level random walk among documents within a document set; and
    computer-readable program code devices configured to cause a computer to determine a quality metric responsive to the number of documents that link to the document.



50. The computer program product of claim 49, wherein the computer-readable program code devices configured to cause a computer to determine a quality metric comprise computer-readable program code devices configured to cause a computer to determine a quality metric responsive to the number of documents that link to the document, and responsive to the quality metric of the linking documents.



51. The computer program product of claim 49, wherein the computer-readable program code devices configured to cause a computer to determine a quality metric comprise computer-readable program code devices configured to cause a computer to determine a value for:

    $$R(p) = d/T + (1 - d) \sum_{i=1}^{k} R(p_i) / C(p_i)$$

where:
    T is the total number of documents in the document set;
    d is a damping factor such that 0 < d < 1;
    documents p_1, ..., p_k each contain at least one link to document p; and
    C(p) is the number of links out of p.

52. The computer program product of claim 49, wherein each document is associated with a host, and wherein the computer-readable program code devices configured to cause a computer to perform a two-level random walk comprise:
    a.1) computer-readable program code devices configured to cause a computer to select a host;
    a.2) computer-readable program code devices configured to cause a computer to select at random a document associated with the host;
    a.3) computer-readable program code devices configured to cause a computer to retrieve the selected document;
    a.4) computer-readable program code devices configured to cause a computer to, responsive to a random event:
        a.4.1) select at random a host from among the previously selected hosts; and
        a.4.2) repeat the operations of a.2) through a.7);
    a.5) computer-readable program code devices configured to cause a computer to select at random a link in the retrieved document;
    a.6) computer-readable program code devices configured to cause a computer to retrieve a document referenced by the selected link; and
    a.7) computer-readable program code devices configured to cause a computer to repeat the operations of a.4) to a.6) until a predetermined condition is met.



53. The computer program product of claim 49, wherein each document
is associated with a host, and wherein the computer-readable
program code devices configured to cause a computer to perform a
two-level random walk comprise:

a.1) computer-readable program code devices configured to cause
a computer to initialize a host set;

a.2) computer-readable program code devices configured to cause
a computer to initialize a document set for each host in the
host set;

a.3) computer-readable program code devices configured to cause
a computer to select at random a host from the host set;

a.4) computer-readable program code devices configured to cause
a computer to, responsive to a random event:
    a.4.1) select at random a host from among the previously
    selected hosts; and
    a.4.2) repeat the operations of a.2) through a.7);

a.5) computer-readable program code devices configured to cause
a computer to select at random a document from the document
set of the selected host;

a.6) computer-readable program code devices configured to cause
a computer to add the selected host to the host set;

a.7) computer-readable program code devices configured to cause
a computer to add the selected document to the document
set of the selected host;

a.8) computer-readable program code devices configured to cause
a computer to, responsive to the selected document containing
at least one link:
    a.8.1) select at random a link from the selected document;
    a.8.2) select a document corresponding to the selected
    link;
    a.8.3) select a host corresponding to the selected document; and
    a.8.4) repeat the operations of a.6) through a.9) until a
    predetermined condition is met; and

a.9) computer-readable program code devices configured to cause
a computer to, responsive to the selected document not containing
at least one link, repeat the operations of a.3) through a.9) until a
predetermined condition is met.
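
As a hedged illustration, the host-set and document-set bookkeeping recited above might look like the following Python sketch. The names `walk_with_sets`, `DEMO_WEB`, `seeds`, `jump_prob`, and `max_steps` are hypothetical, the link graph is an in-memory dictionary rather than real retrieval, and the step budget is only one possible "predetermined condition".

```python
import random
from urllib.parse import urlparse

# Hypothetical toy link graph: URL -> outgoing links.
DEMO_WEB = {
    "http://a.example/1": ["http://b.example/1", "http://a.example/2"],
    "http://a.example/2": [],
    "http://b.example/1": ["http://a.example/1"],
}

def walk_with_sets(web, seeds, max_steps, jump_prob=0.1, seed=None):
    """Random walk that grows a host set and a per-host document set."""
    rng = random.Random(seed)
    doc_sets = {}                                    # a.1), a.2) host -> documents
    for url in seeds:
        doc_sets.setdefault(urlparse(url).netloc, set()).add(url)

    def restart():
        host = rng.choice(sorted(doc_sets))          # a.3) / a.4.1) random host
        return host, rng.choice(sorted(doc_sets[host]))  # a.5) random document

    visits = []
    host, doc = restart()
    for _ in range(max_steps):                       # assumed stop condition
        doc_sets.setdefault(host, set()).add(doc)    # a.6), a.7) record host and doc
        visits.append(doc)
        links = web.get(doc, [])
        if links and rng.random() >= jump_prob:      # a.8) document has a link
            doc = rng.choice(links)                  # a.8.1), a.8.2)
            host = urlparse(doc).netloc              # a.8.3)
        else:                                        # a.4) random event or a.9) dead end
            host, doc = restart()
    return visits, doc_sets

if __name__ == "__main__":
    visits, doc_sets = walk_with_sets(DEMO_WEB, ["http://a.example/1"], 8, seed=3)
    print(visits)
    print({h: sorted(d) for h, d in doc_sets.items()})
```

Sorting the sets before each random choice is a design choice made only so that a seeded run is reproducible.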



The computer program product of claim 49, further comprising:
computer-readable program code devices configured to cause
a computer to determine a quality metric for at least one
additional document; and

computer-readable program code devices configured to cause
a computer to rank the quality metric of the first document
with respect to the quality metrics of the additional documents.

A computer program product comprising a computer-usable
medium having computer-readable code embodied therein for randomly
walking through a hypertext-linked document set comprising a plurality of
documents, wherein at least a subset of the documents contain a plurality of
links to other documents, each document being associated with a host, the
computer program product comprising:

a) computer-readable program code devices configured to cause
a computer to select a host;

b) computer-readable program code devices configured to cause
a computer to select at random a document associated with
the host;

c) computer-readable program code devices configured to cause
a computer to retrieve the selected document;

d) computer-readable program code devices configured to cause
a computer to, responsive to a random event:
    d.1) select at random a host from among the previously
    selected hosts; and
    d.2) repeat the operations of b) through e) until a
    predetermined condition is met; and

e) computer-readable program code devices configured to cause
a computer to, responsive to the random event not occurring:
    e.1) select at random a link in the retrieved document;
    e.2) retrieve a document referenced by the selected
    link; and
    e.3) repeat the operations of d) and e) until a
    predetermined condition is met.



A computer program product comprising a computer-usable
medium having computer-readable code embodied therein for measuring
relative quality of a document in a document set comprising a plurality of
documents, wherein at least a subset of the documents contain a plurality of
links to other documents, the computer program product comprising:

a) computer-readable program code devices configured to cause
a computer to perform a two-level random walk among
documents within a document set, the computer-readable
program code devices comprising:

    a.1) computer-readable program code devices configured
    to cause a computer to initialize a host set;

    a.2) computer-readable program code devices configured
    to cause a computer to initialize a document set for
    each host in the host set;

    a.3) computer-readable program code devices configured
    to cause a computer to select at random a host from
    the host set;

    a.4) computer-readable program code devices configured
    to cause a computer to, responsive to a random event:
        a.4.1) select at random a host from among the
        previously selected hosts; and
        a.4.2) repeat the operations of a.2) through a.7);

    a.5) computer-readable program code devices configured
    to cause a computer to select at random a document
    from the document set of the selected host;

    a.6) computer-readable program code devices configured
    to cause a computer to add the selected host to the
    host set;

    a.7) computer-readable program code devices configured
    to cause a computer to add the selected document to
    the document set of the selected host;

    a.8) computer-readable program code devices configured
    to cause a computer to, responsive to the selected
    document containing at least one link:
        a.8.1) select at random a link from the selected
        document;
        a.8.2) select a document corresponding to the
        selected link;
        a.8.3) select a host corresponding to the selected
        document;
        a.8.4) repeat the operations of a.6) through a.9)
        until a predetermined condition is met; and

    a.9) computer-readable program code devices configured
    to cause a computer to, responsive to the selected
    document not containing at least one link, repeat the
    operations of a.3) through a.9) until a predetermined
    condition is met;

b) computer-readable program code devices configured to cause
a computer to determine a quality metric responsive to the
number of documents that link to the document;

c) computer-readable program code devices configured to cause
a computer to determine a quality metric for at least one
additional document; and

d) computer-readable program code devices configured to cause
a computer to rank the quality metric of the first document
with respect to the quality metrics of the additional documents.
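
By way of example only, elements b) through d) above (a quality metric responsive to the number of linking documents, followed by a ranking) could be sketched as below. The function names and the toy link graph are hypothetical, and counting each linking document once while ignoring self-links is an assumption that the claim does not specify.

```python
def inlink_counts(web, encountered):
    """Count, for each encountered document, how many distinct documents link to it."""
    counts = {url: 0 for url in encountered}
    for src, links in web.items():
        for dst in set(links):              # assumed: count each linking document once
            if dst in counts and dst != src:  # assumed: ignore self-links
                counts[dst] += 1
    return counts

def rank_by_quality(counts):
    """Order documents from highest to lowest metric, breaking ties by URL."""
    return sorted(counts, key=lambda url: (-counts[url], url))

if __name__ == "__main__":
    # Hypothetical link graph: document -> documents it links to.
    web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c", "c"]}
    counts = inlink_counts(web, ["a", "b", "c"])
    print(counts, rank_by_quality(counts))
```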

A system for randomly walking through a hypertext-linked document
set comprising a plurality of documents, wherein at least a subset of
the documents contain a plurality of links to other documents, each
document being associated with a host, the system comprising:

a) a host selector;

b) a random document selector, coupled to the host selector,
for selecting at random a document associated with the host;

c) a document retriever, coupled to the random document
selector, for retrieving the selected document; and

d) a link selector, coupled to the document retriever, for
selecting at random a link in the retrieved document;

wherein the document retriever retrieves a document referenced by
the selected link;

and wherein the link selector repeatedly selects at random a link and
the document retriever repeatedly retrieves a document referenced by the
selected link, until a predetermined condition is met.



A system for measuring relative quality of a search engine index,
comprising:

a random walker, for performing a two-level random walk among
documents within a document set;

a determination module, coupled to the random walker, for, for each
document encountered in the random walk, determining
whether the document is indexed by the search engine
index; and

a results aggregation module, coupled to the determination module,
for aggregating the results of the determination module.
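
The determination and aggregation modules above can be illustrated with a short Python sketch. It assumes, hypothetically, that the walk is already available as a list of URLs, that index membership can be tested with a callable `is_indexed`, and that "aggregating" means reporting the fraction of encountered documents that are indexed; the claim itself does not fix any of these details.

```python
def index_coverage(walk, is_indexed):
    """Return the fraction of documents in `walk` that the index contains."""
    results = [bool(is_indexed(url)) for url in walk]   # determination module
    if not results:
        return 0.0
    return sum(results) / len(results)                  # results aggregation module

if __name__ == "__main__":
    # Hypothetical data: a recorded walk and a set standing in for an index.
    walk = ["a", "b", "a", "c"]
    index = {"a", "b"}
    print(index_coverage(walk, index.__contains__))
```

Because a document visited several times is counted each time, frequently visited documents weigh more heavily in the aggregate under this assumed scheme.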

system for measuring relative quality of a document in a docu- 
ment set, comprising: 

a random walker, for pe] forming a two-level random walk among 

lin a document set; and 
coupled to the random walker, for deter- 
mining a quality metric responsive to the number of times 
the document is encountered in the random walk. 



documents wit 
a determination module. 
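
As a rough sketch of the determination module above, a quality metric responsive to the number of encounters can be computed from a recorded walk. The function name and the normalization by walk length are assumptions made for illustration; the claim only requires that the metric be responsive to the encounter count.

```python
from collections import Counter

def visit_quality(walk):
    """Map each document to its share of visits in the recorded walk."""
    counts = Counter(walk)              # number of times each document was encountered
    total = len(walk)
    return {url: n / total for url, n in counts.items()}

if __name__ == "__main__":
    # Hypothetical recorded walk over three documents.
    print(visit_quality(["a", "b", "a", "c", "a", "b"]))
```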